Context Matters: A Strategy to Pre-train Language Model for Science Education
Liu, Zhengliang, He, Xinyu, Liu, Lei, Liu, Tianming, Zhai, Xiaoming
This study aims to improve the performance of automatically scoring student responses in science education. BERT-based language models have shown significant superiority over traditional NLP models in various language-related tasks. However, students' science writing, including argumentation and explanation, is domain-specific. In addition, the language students use differs from the language in journal articles and Wikipedia, which are the training sources of BERT and its existing variants. Both observations suggest that a domain-specific model pre-trained on science education data may improve model performance. However, the ideal type of data for contextualizing a pre-trained language model and improving its performance on automatically scoring student written responses remains unclear. Therefore, we employ different data in this study to contextualize both BERT and SciBERT models and compare their performance on automatic scoring of assessment tasks for scientific argumentation. We use three datasets to pre-train the models: 1) journal articles in science education, 2) a large dataset of students' written responses (sample size over 50,000), and 3) a small dataset of students' written responses to scientific argumentation tasks. Our experimental results show that in-domain training corpora constructed from science questions and responses improve language model performance on a wide variety of downstream tasks. Our study confirms the effectiveness of continual pre-training on domain-specific data in the education domain and demonstrates a generalizable strategy for automating science education tasks with high accuracy. We plan to release our data and SciEdBERT models for public use and community engagement.
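The continual pre-training the abstract describes relies on BERT's masked-language-model (MLM) objective applied to a new in-domain corpus. As a minimal sketch of the core data-preparation step, the function below implements the standard BERT masking recipe (select ~15% of positions; of those, 80% become `[MASK]`, 10% a random token, 10% stay unchanged). The function name, the `MASK_ID` of 103, and the vocabulary size are assumptions based on the standard `bert-base` vocabulary, not details from the paper.

```python
import random

MASK_ID = 103       # [MASK] token id (assumption: bert-base vocabulary)
VOCAB_SIZE = 30522  # bert-base vocabulary size (assumption)

def mask_tokens(token_ids, mask_prob=0.15, rng=None):
    """BERT-style masking for MLM pre-training.

    At each position selected with probability mask_prob:
      80% -> replace with [MASK], 10% -> random token, 10% -> keep.
    Labels hold the original token id at supervised positions and
    -100 (the usual ignore index) everywhere else.
    """
    rng = rng or random.Random()
    inputs = list(token_ids)
    labels = [-100] * len(inputs)
    for i, tok in enumerate(inputs):
        if rng.random() < mask_prob:
            labels[i] = tok          # supervise this position
            roll = rng.random()
            if roll < 0.8:
                inputs[i] = MASK_ID
            elif roll < 0.9:
                inputs[i] = rng.randrange(VOCAB_SIZE)
            # else: leave the original token in place
    return inputs, labels
```

In a full pipeline, the domain corpus (journal articles or student responses) would be tokenized, masked this way, and used to continue training the BERT or SciBERT checkpoint before fine-tuning on the scoring task.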
Context Matters: Why AI is (still) bad at making decisions.
A man and a woman live together in a single household. Every time the man speaks, he speaks with impatience, cursing loudly at you when you don't understand his questions. He's gruff bordering on angry, only talks to you late at night, and only asks about nightclubs and bars. Every time the woman speaks, she either speaks in tears or in monotone. Her questions are more housework related -- recipes, shopping reminders, financial questions. She talks to you throughout the day, and often asks about mental health problems.
Context Matters in Data-Centric NLP
In model-centric approaches to AI, you accept the data you're given and focus on iteratively improving your model. In data-centric AI, however, you focus on improving your data instead. One important aspect: the context your data comes from, and whether that's captured by your features and labels… This article (cross-posted from the Surge AI blog) goes into 5 examples where context-sensitive features and context-sensitive labels are crucial for AI applications. Is that an insult or a compliment? I can't believe these kids support sweatshops.
Tinder Swipes Right on AI to Help Stop Harassment
On Tinder, an opening line can go south pretty quickly. And while there are plenty of Instagram accounts dedicated to exposing these "Tinder nightmares," when the company looked at its numbers, it found that users reported only a fraction of behavior that violated its community standards. Now, Tinder is turning to artificial intelligence to help people dealing with grossness in the DMs. The popular online dating app will use machine learning to automatically screen for potentially offensive messages. If a message gets flagged in the system, Tinder will ask its recipient: "Does this bother you?"
Context Matters When Text Mining
Many times, the most-followed approach can result in failure, usually because of the assumption that one approach works in all cases. This is especially true in text mining. For instance, a common approach to clustering documents is to create a tf-idf matrix for all documents, apply SVD or another dimension-reduction algorithm, and then run a clustering algorithm. In most cases this will work; however, as I will show here, there are instances where this process will not produce the intended result.